4 research outputs found

Genetic Methods for Machine Learning Models: The Case of Financial Time Series Forecasting

Author: Muñoz-Elguezábal Juan F.
Publication venue: 'ITESO, A.C.'
Publication date: 01/05/2021

Financial time series forecasting is a predictive modeling process with many challenges, mainly because of the temporal structure of the data. Genetic programming, a particular variation of genetic algorithms, can be used for feature engineering, feature importance, and feature selection all at once; it can provide highly interpretable symbolic features that have low collinearity among them and yet high correlation with a target variable. We present the use of this method for generating symbolic features from endogenous linear and autoregressive variables, along with a Multi-Layer Perceptron, to construct a binary predictor for the price of Continuous Future Contracts of the Usd/Mxn intra-day exchange rate. The proposition of this work is threefold. First, a variation is stated that reformulates the classical regression problem of forecasting a continuous value as a classification problem of forecasting a discrete, binary value. Second, to address the feature engineering step, the use of Genetic Programming is proposed for producing nonlinear variables highly correlated with a target and highly uncorrelated with each other. Finally, variations on the performance metrics and on the folds of data used to perform the training process are implemented. The results are presented for a Logistic regression and a Multi-Layer Perceptron applied to 6 years of historical prices for the UsdMxn Financial Future contract.
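
The reformulation from regression to binary classification described in this abstract can be illustrated with a minimal sketch. It is not the author's pipeline: the function name `make_binary_dataset`, the use of lagged log returns as autoregressive features, and the "up versus not up" target are all assumptions made for illustration, and the genetic programming step is not shown.

```python
import numpy as np

def make_binary_dataset(close, n_lags=3):
    """Build lagged-return features and a binary up/down target (illustrative only)."""
    close = np.asarray(close, dtype=float)
    if len(close) < n_lags + 2:
        raise ValueError("need at least n_lags + 2 prices")
    # Log returns: one fewer element than the price series.
    returns = np.diff(np.log(close))
    m = len(returns)
    # Row t holds returns[t], ..., returns[t + n_lags - 1] (the autoregressive inputs).
    X = np.column_stack([returns[i:m - n_lags + i] for i in range(n_lags)])
    # Assumed target: 1 if the return that follows the lag window is positive, else 0.
    y = (returns[n_lags:] > 0).astype(int)
    return X, y

# Hypothetical prices, not data from the thesis.
prices = [100.0, 101.0, 100.0, 102.0, 103.0, 101.0]
X, y = make_binary_dataset(prices, n_lags=2)
print(X.shape, y)
```

Each row of `X` only uses returns observed before the return that defines its label, so the target never leaks into the features.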

Repositorio Institucional del ITESO

T-fold sequential-validation technique for out-of-distribution generalization with financial time series data

Author: Muñoz-Elguezábal Juan F.
Sánchez-Torres Juan D.
Publication venue: International Conference on Econometrics and Statistics
Publication date: 01/06/2021

The temporal structure in financial time series (FTS) data demands non-trivial considerations in the use of cross-validation (CV). This frequently used technique is based on statistical learning theory, which is founded on the assumption that training samples are i.i.d. Although there is progress in studying fundamental phenomena in certain learning methods, such as feature selection imbalance during the learning stage, it is currently widely accepted that there is no reason to expect good out-of-sample results from a learning process without such a strong assumption. In FTS, there are conditions under which sub-sampling data overshadows the effect of non-deterministic relationships between features and the target variable among different samples. This effect remains unnoticed given the use of the additivity property in the decomposition of objective functions for the learning process. Moreover, it reduces the relationship among samples to a particular operation, without information attribution. We present a technique that controls information leakage and decomposes the global probability distribution into local probability distributions, identifying each sample's contribution to the learning process and maintaining information sparsity, thereby relaxing the effects of the i.i.d. assumption. As a result, parametric stability is presented for exchange rate prediction using different predictive models.

ITESO, A.C.
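
The abstract does not spell out how the T folds are built, so the following is only a sketch of one plausible reading: the series is cut into contiguous, chronologically ordered blocks without shuffling, and each block is evaluated on the block that follows it. The helper names `sequential_folds` and `walk_forward_pairs` are hypothetical and do not come from the paper.

```python
import numpy as np

def sequential_folds(n_samples, n_folds):
    """Split sample indices into contiguous chronological blocks (no shuffling)."""
    if not 1 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 1 and n_samples")
    return np.array_split(np.arange(n_samples), n_folds)

def walk_forward_pairs(folds):
    """Pair each block (train) with the block right after it (test)."""
    return [(folds[i], folds[i + 1]) for i in range(len(folds) - 1)]

folds = sequential_folds(n_samples=10, n_folds=3)
for train_idx, test_idx in walk_forward_pairs(folds):
    # Every training index precedes every test index, so no future data leaks backward.
    print(train_idx, "->", test_idx)
```

Keeping the blocks contiguous is the design choice that matters here: a shuffled K-fold split would mix past and future observations, which is the kind of information leakage the abstract says the technique controls.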

Repositorio Institucional del ITESO

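Subsequence time series clustering: Evidence of temporal patterns in the UsdMxn exchange rate (original title: Clustering subsecuencial de series de tiempo: Evidencia de patrones temporales en el tipo de cambio UsdMxn)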

Author: Muñoz-Elguezábal Juan F.
Ruiz-Cruz Riemann
Publication venue: Centro de Investigación en Matemáticas, AC (CIMAT)
Publication date: 02/11/2020

This work addresses subsequence time series clustering, a technique that groups subsequences contained within a single time series by computing a Euclidean distance term as a measure of similarity between the data. We use the MASS algorithm (Mueen's Algorithm for Similarity Search) to identify temporal patterns in subsequences of intraday prices of the US Dollar versus Mexican Peso exchange rate (UsdMxn). A quasi-exhaustive search for evidence is conducted using 10 years of data: 14.5 million prices (one-minute OHLC) and 36,000 measurements of macroeconomic indicators. The results we show are consistent, and we document the conditions under which the Efficient Market Hypothesis does not hold.

ITESO, A.C.
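
The similarity search behind this abstract can be sketched with a naive distance profile: the Euclidean distance between a z-normalized query and every z-normalized window of the series. MASS obtains this kind of profile much faster by using the FFT; the loop below is a swapped-in brute-force version for illustration only, and the names `znorm` and `distance_profile` as well as the toy data are assumptions.

```python
import numpy as np

def znorm(x):
    """Z-normalize a window; a constant window maps to all zeros."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def distance_profile(query, series):
    """Naive O(n*m) profile of z-normalized Euclidean distances (not the FFT-based MASS)."""
    q = znorm(query)
    series = np.asarray(series, dtype=float)
    m = len(q)
    n_windows = len(series) - m + 1
    if n_windows < 1:
        raise ValueError("query is longer than the series")
    profile = np.empty(n_windows)
    for i in range(n_windows):
        # Compare the query with the window starting at position i.
        profile[i] = np.linalg.norm(q - znorm(series[i:i + m]))
    return profile

# Toy series with a repeated rise-and-fall pattern; not data from the study.
series = [0, 1, 2, 1, 0, 1, 2, 1, 0]
query = [10, 20, 30, 20]  # same shape as [0, 1, 2, 1], different scale and offset
print(np.round(distance_profile(query, series), 3))
```

Positions where the profile is close to zero mark windows whose shape matches the query regardless of price level, which is why z-normalization is a common choice when comparing price subsequences.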

Repositorio Institucional del ITESO

Prediction models in business and government through statistical learning (original title: Modelos de predicción en empresas y gobierno mediante aprendizaje estadístico)

Author: Ledesma-Elorriaga Rodrigo
Muñoz-Elguezábal Juan F.
Velázquez-Manzanero Zurisadai
Publication date: 01/12/2015

Professional Application Project (Proyecto de Aplicación Profesional) based on the knowledge area of machine learning, which consists of using statistics and programming to build mathematical models, some in closed form (classified as analytical) and others in open form (classified as computational). Four topics were addressed in this project: Support Vector Machines, decision trees and their variants, Random Forest, and Natural Language Processing. As support, the Git version control system was used, and GitHub served as the cloud storage platform.

Repositorio Institucional del ITESO